Chinese Word Sense Disambiguation based on Context Expansion

Authors

  • Zhizhuo Yang
  • Heyan Huang
Abstract

Word Sense Disambiguation (WSD) is one of the key issues in natural language processing. Supervised WSD methods are currently an effective way to resolve ambiguity, but because large-scale training data is lacking, they cannot achieve satisfactory results. In this paper, we hypothesize that synonyms of context words can provide additional knowledge for the WSD task, and we present two WSD methods based on context expansion. The first method treats synonyms as topical contextual features for training a Bayesian model. The second method treats contexts made up of synonyms as pseudo training data, and then derives the meaning of ambiguous words using knowledge from both the training data and the pseudo training data. Experimental results show that the second method significantly improves on traditional WSD accuracy, by 2.21%. It also outperforms the best system in SemEval-2007.
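The two context-expansion ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the synonym resource nor the model details, so the toy lexicon `SYNONYMS`, the English example data, the add-one smoothing, and the helper names `expand_context` and `pseudo_examples` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy synonym lexicon (assumption: the abstract does not name
# the resource the authors actually used).
SYNONYMS = {
    "money": ["cash", "funds"],
    "deposit": ["savings"],
    "river": ["stream"],
}

def expand_context(words, synonyms=SYNONYMS):
    """Method 1 (sketch): append synonyms of each context word as extra features."""
    expanded = list(words)
    for w in words:
        expanded.extend(synonyms.get(w, []))
    return expanded

def pseudo_examples(examples, synonyms=SYNONYMS):
    """Method 2 (sketch): build extra training instances made only of synonyms."""
    pseudo = []
    for context, sense in examples:
        syns = [s for w in context for s in synonyms.get(w, [])]
        if syns:
            pseudo.append((syns, sense))
    return pseudo

class NaiveBayesWSD:
    """Bag-of-words naive Bayes sense classifier with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples):
        for context, sense in examples:
            self.sense_counts[sense] += 1
            for w in context:
                self.word_counts[sense][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, context):
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            score = math.log(n / total)  # log prior of the sense
            denom = sum(self.word_counts[sense].values()) + self.alpha * len(self.vocab)
            for w in context:
                if w in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[sense][w] + self.alpha) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense

# Toy sense-tagged contexts for the ambiguous word "bank".
train = [
    (["money", "deposit"], "finance"),
    (["river", "water"], "shore"),
]

# Method 1: train on synonym-expanded contexts.
expanded_model = NaiveBayesWSD().fit([(expand_context(c), s) for c, s in train])

# Method 2: train on the original data plus synonym-only pseudo instances.
pseudo_model = NaiveBayesWSD().fit(train + pseudo_examples(train))

print(expanded_model.predict(["stream"]))
print(pseudo_model.predict(["stream"]))
```

In this toy setup the word "stream" never occurs in the original training contexts, so a model trained on `train` alone has no evidence for it; both expanded models see it through the synonym lexicon. How the paper weights pseudo instances against real ones is not stated in the abstract, so the sketch simply counts them equally.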

Similar resources

An Unsupervised Approach to Chinese Word Sense Disambiguation Based on Hownet

The research on word sense disambiguation (WSD) has great theoretical and practical significance in many fields of natural language processing (NLP). This paper presents an unsupervised approach to Chinese word sense disambiguation based on Hownet (an electronic Chinese lexical resource). In our approach, contexts that include ambiguous words are converted into vectors by means of a second-orde...

SRCB-WSD: Supervised Chinese Word Sense Disambiguation with Key Features

This article describes the implementation of Word Sense Disambiguation system that participated in the SemEval-2007 multilingual Chinese-English lexical sample task. We adopted a supervised learning approach with Maximum Entropy classifier. The features used were neighboring words and their part-of-speech, as well as single words in the context, and other syntactic features based on shallow par...

Joining automatic query expansion based on thesaurus and word sense disambiguation using WordNet

The selection of the most appropriate sense of an ambiguous word in a certain context is one of the main problems in Information Retrieval (IR). For this task, it is usually necessary to count on a semantic source, that is, linguistic resources like dictionaries, thesaurus, etc. Using a methodology based on simulation under a vector space model, we show that the use of automatic query expansion...

Ontology Based Query Expansion Using Word Sense Disambiguation

The existing information retrieval techniques do not consider the context of the keywords present in the user’s queries. Therefore, the search engines sometimes do not provide sufficient information to the users. New methods based on the semantics of user keywords must be developed to search in the vast web space without incurring loss of information. The semantic based information retrieval te...

Context Expansion with Global Keywords for a Conceptual Density-Based WSD

The resolution of the lexical ambiguity, which is commonly referred to as Word Sense Disambiguation, is still an open problem in the field of Natural Language Processing. An approach to Word Sense Disambiguation based on Conceptual Density (a measure of the correlation between concepts) obtained good results with small context windows. This paper presents a method to integrate global knowledge,...

Publication date: 2012